





























- Read hits (I\$ and D\$)
  - This is what we want!
- Write hits (D\$ only)
  - Require the cache and memory to be consistent:
    - Always write the data into both the cache block and the next level in the memory hierarchy (*write-through*).
    - Writes then run at the (slow) speed of the next level in the memory hierarchy, unless a *write buffer* is used, in which case the processor stalls only when the write buffer is full.
  - Allow cache and memory to be inconsistent:
    - Write the data only into the cache block (*write-back*); the block is written to the next level in the memory hierarchy only when it is evicted.
    - Need a *dirty* bit for each cache block to tell if it needs to be written back to memory when it is evicted – can use a *write buffer* to help buffer write-backs of dirty blocks.
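
The two write-hit policies above can be contrasted with a minimal sketch. The class and function names (`Memory`, `CacheBlock`, `write_hit_through`, `write_hit_back`, `evict`) are hypothetical and chosen only for illustration; the model tracks a single block and ignores write buffers and timing.

```python
class Memory:
    """Stand-in for the next level of the hierarchy; counts the writes it receives."""
    def __init__(self):
        self.data = {}
        self.writes = 0

    def write(self, addr, value):
        self.data[addr] = value
        self.writes += 1


class CacheBlock:
    """One cache block (hypothetical single-word model)."""
    def __init__(self, addr, value):
        self.addr = addr
        self.value = value
        self.dirty = False  # only used by the write-back policy


def write_hit_through(block, value, memory):
    """Write-through: update the cache block and the next level together."""
    block.value = value
    memory.write(block.addr, value)


def write_hit_back(block, value):
    """Write-back: update only the cache block and mark it dirty."""
    block.value = value
    block.dirty = True


def evict(block, memory):
    """On eviction, a dirty block is written back; a clean block is simply dropped."""
    if block.dirty:
        memory.write(block.addr, block.value)
        block.dirty = False


mem_wt, mem_wb = Memory(), Memory()
blk_wt, blk_wb = CacheBlock(0x40, 0), CacheBlock(0x40, 0)
for v in (1, 2, 3):          # three write hits to the same block
    write_hit_through(blk_wt, v, mem_wt)
    write_hit_back(blk_wb, v)
evict(blk_wb, mem_wb)        # write-back reaches memory only here
print(mem_wt.writes, mem_wb.writes)  # prints "3 1"
```

In this sketch, repeated writes to the same block cost one memory write each under write-through but a single write at eviction under write-back, at the price of cache and memory being inconsistent in between.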





- Read misses (I\$ and D\$)
  - Processed the same as for single-word blocks: a miss returns the entire block from memory.
  - Miss penalty grows as block size grows; two techniques reduce it:
    - *Early restart*: the processor resumes execution as soon as the requested word of the block is returned.
    - *Requested word first*: the requested word is transferred from the memory to the cache (and processor) first.
  - A *non-blocking cache* allows the processor to continue to access the cache while the cache is handling an earlier miss.
- Write misses (D\$)
  - If using *write-allocate*, the cache must *first* fetch the block from memory and then write the word to the block.
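
To see how early restart and requested word first shorten the stall on a read miss, here is a rough timing sketch. It assumes a simple model that is not from the notes: the first word arrives `latency` cycles after the miss and each later word arrives `per_word` cycles after the previous one. The function name, policy names, and numbers are all hypothetical.

```python
def stall_cycles(block_words, requested, latency, per_word, policy):
    """Cycles the processor stalls on a read miss under a simple transfer model.

    block_words: words per cache block
    requested:   index (0-based) of the word the processor asked for
    latency:     cycles until the first transferred word arrives (assumed)
    per_word:    cycles between successive words (assumed)
    """
    def arrival(position):
        # Arrival time of the word sent at the given position in the transfer.
        return latency + position * per_word

    if policy == "wait_for_block":
        # Baseline: resume only after the last word of the block arrives.
        return arrival(block_words - 1)
    if policy == "early_restart":
        # Words arrive in normal order; resume when the requested one arrives.
        return arrival(requested)
    if policy == "requested_word_first":
        # The requested word is sent first, so it arrives at position 0.
        return arrival(0)
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical example: 8-word block, word 5 requested, 10-cycle latency, 2 cycles/word.
results = [stall_cycles(8, 5, 10, 2, p)
           for p in ("wait_for_block", "early_restart", "requested_word_first")]
print(*results)  # prints "24 20 10"
```

Under these assumptions, early restart helps most when the requested word sits near the start of the block, while requested word first gives the same stall regardless of the word's position.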





What if the processor clock rate is doubled (doubling the miss penalty in clock cycles)?
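
One way to explore this question is a small calculation, sketched below under stated assumptions: memory access time in nanoseconds stays fixed, so doubling the clock rate doubles the miss penalty measured in cycles. The function name and every number (ideal CPI of 2, 0.05 misses per instruction, 50 ns miss penalty, 2 GHz base clock) are hypothetical and not taken from the notes.

```python
def time_per_instruction_ns(cpi_ideal, misses_per_instr, miss_penalty_ns, clock_ghz):
    """Average time per instruction, including memory stall cycles."""
    cycle_ns = 1.0 / clock_ghz
    # Memory latency is fixed in ns, so a faster clock means more penalty cycles.
    miss_penalty_cycles = miss_penalty_ns / cycle_ns
    cpi = cpi_ideal + misses_per_instr * miss_penalty_cycles
    return cpi * cycle_ns


# Hypothetical machine: ideal CPI 2, 0.05 misses/instruction, 50 ns miss penalty.
base = time_per_instruction_ns(2, 0.05, 50, clock_ghz=2)     # 100-cycle penalty
doubled = time_per_instruction_ns(2, 0.05, 50, clock_ghz=4)  # 200-cycle penalty
speedup = base / doubled
print(f"{speedup:.2f}")  # prints "1.17"
```

With these assumed numbers, doubling the clock rate yields well under a 2x speedup, because the memory stall time per instruction does not shrink with the cycle time.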





